NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

MDP Geometry, Normalization and Reward Balancing Solvers

Mustafin, Arsenii; Pakharev, Aleksei; Olshevsky, Alex; Paschalidis, Ioannis (March 2025, Proceedings of AISTATS (28th International Conference on Artificial Intelligence and Statistics))

We present a new geometric interpretation of Markov Decision Processes (MDPs) with a natural normalization procedure that allows us to adjust the value function at each state without altering the advantage of any action with respect to any policy. This advantage-preserving transformation of the MDP motivates a class of algorithms which we call Reward Balancing, which solve MDPs by iterating through these transformations, until an approximately optimal policy can be trivially found. We provide a convergence analysis of several algorithms in this class, in particular showing that for MDPs for unknown transition probabilities we can improve upon state-of-the-art sample complexity results.
more » « less
Free, publicly-accessible full text available March 10, 2026
Network-Based Epidemic Control Through Optimal Travel and Quarantine Management

https://doi.org/10.1109/TCNS.2025.3590383

Talaei, Mahtab; Rikos, Apostolos I; Olshevsky, Alex; White, Laura F; Paschalidis, Ioannis Ch (January 2025, IEEE Transactions on Control of Network Systems)

Full Text Available
Optimal Fixed Lockdown for Pandemic Control

https://doi.org/10.1109/TAC.2023.3340556

Ma, Qianqian; Liu, Yang-Yu; Olshevsky, Alex (July 2024, IEEE Transactions on Automatic Control)

Full Text Available
One-Shot Averaging for Distributed TD (λ) Under Markov Sampling

https://doi.org/10.1109/LCSYS.2024.3412648

Tian, Haoxing; Paschalidis, Ioannis Ch; Olshevsky, Alex (June 2024, IEEE Control Systems Letters)

Full Text Available
A Small Gain Analysis of Single Timescale Actor Critic

https://doi.org/10.1137/22M1483335

Olshevsky, Alex; Gharesifard, Bahman (April 2023, SIAM Journal on Control and Optimization)

Full Text Available
Distributed TD(0) With Almost No Communication

https://doi.org/10.1109/LCSYS.2023.3287952

Liu, Rui; Olshevsky, Alex (January 2023, IEEE Control Systems Letters)

Full Text Available
Temporal Difference Learning as Gradient Splitting

Liu, Rui; Olshevsky, Alex (November 2021, International Conference on Machine Learning (ICML))

Temporal difference learning with linear function approximation is a popular method to obtain a low-dimensional approximation of the value function of a policy in a Markov Decision Process. We give a new interpretation of this method in terms of a splitting of the gradient of an appropriately chosen function. As a consequence of this interpretation, convergence proofs for gradient descent can be applied almost verbatim to temporal difference learning. Beyond giving a new, fuller explanation of why temporal difference works, our interpretation also yields improved convergence times. We consider the setting with 1/T^{1/2} step-size, where previous comparable finite-time convergence time bounds for temporal difference learning had the multiplicative factor 1/(1-\gamma) in front of the bound, with γ being the discount factor. We show that a minor variation on TD learning which estimates the mean of the value function separately has a convergence time where 1/(1-\gamma) only multiplies an asymptotically negligible term.
more » « less
Full Text Available
Asymptotic Convergence Rate of Alternating Minimization for Rank One Matrix Completion

https://doi.org/10.1109/LCSYS.2020.3016626

Liu, Rui; Olshevsky, Alex (October 2021, IEEE Control Systems Letters)
null (Ed.)
Full Text Available
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers

Ma, Yao; Olshevsky, Alex; Saligrama, Venkatesh; Czepesvari, Csoba (June 2021, Journal of machine learning research)

We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill-estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlation between workers. We show that the correlation matrix can be successfully recovered and skills are identifiable if and only if the sampling matrix (observed components) does not have a bipartite connected component. We then propose a projected gradient descent scheme and show that skill estimates converge to the desired global optima for such sampling matrices. Our proof is original and the results are surprising in light of the fact that even the weighted rank-one matrix factorization problem is NP-hard in general. Next, we derive sample complexity bounds in terms of spectral properties of the signless Laplacian of the sampling matrix. Our proposed scheme achieves state-of-art performance on a number of real-world datasets.
more » « less
Full Text Available
Communication-efficient SGD: From Local SGD to One-Shot Averaging

Spiridonoff, Artin; Olshevsky, Alex; Paschalidis, Ioannis Ch. (January 2021, Advances in neural information processing systems)

Full Text Available

« Prev Next »

Search for: All records